Abstract. We prove exact bounds on the time complexity of distributed graph
colouring. If we are given a directed path that is properly coloured with n colours,
by prior work it is known that we can find a proper 3-colouring in $\frac{1}{2}\log^*(n) \pm O(1)$
communication rounds. We close the gap between upper and lower bounds: we show
that for infinitely many n the time complexity is precisely $\frac{1}{2}\log^* n$ communication
rounds.



1 Introduction 


One of the key primitives in the area of distributed graph algorithms is graph colouring 
in directed paths. This is a fundamental symmetry-breaking task, widely studied since 
the 1980s—it is used as a subroutine in numerous efficient distributed algorithms, and 
it also serves as a convenient starting point in many lower-bound proofs. In the 1990s it 
was already established that the distributed computational complexity of this problem 
is $\frac{1}{2}\log^*(n) \pm O(1)$ communication rounds [3,13]. We are now able to give exact bounds
on the distributed time complexity of this problem, and the answer turns out to take a 
surprisingly elegant form: 

Theorem 1. For infinitely many values of n, it takes exactly $\frac{1}{2}\log^* n$ rounds to compute
a 3-colouring of a directed path.

1.1 Problem Setting 

Throughout this work we focus on deterministic distributed algorithms. As is common 
in this context, what actually matters is not the number of nodes but the range of their 
labels. For the sake of concreteness, we study precisely the following problem setting: 

We have a path or a cycle with any number of nodes, and the nodes are 
properly coloured with colours from $[n] = \{1, 2, \dots, n\}$.

The techniques that we present in this work can also be used to analyse other variants of 
the problem, for example, a cycle with n nodes that are labelled with some permutation
of [n], or a path with at most n nodes that are labelled with unique identifiers from [n]. 
However, the exact bounds on the time complexity will slightly depend on such details. 

We will assume that there is a globally consistent orientation in the path: each 
node has at most one predecessor and at most one successor. Our task is to find a 
proper colouring of the path with c colours, for some number $c \ge 3$. We will call this
task colour reduction from n to c. 

We will use the following model of distributed computing. Each node of the graph 
is a computational entity. Initially, each node knows the global parameters n and c, its 
own label from [n] , its degree, and the orientations of its incident edges. Computation 
takes place in synchronous communication rounds. In each round, each node can send a 
message to each of its neighbours, receive a message from each of its neighbours, update 
its state, and possibly stop and output its colour. The running time of an algorithm is 
defined to be the number of communication rounds until all nodes have stopped. We 
will use the following notation: 

• C(n, c) is the time complexity of colour reduction from n to c. 

• T(n, c) is the time complexity of colour reduction from n to c if we restrict the 
algorithm so that a node can only send messages to its successor. We call such 
an algorithm one-sided , while unrestricted algorithms are two-sided. 

We can compose colour reduction algorithms, yielding $C(a, c) \le C(a, b) + C(b, c)$ and
$T(a, c) \le T(a, b) + T(b, c)$ for any $a \ge b \ge c$. It is easy to see (shown in Lemma 2) that

$$C(n, c) = \lceil T(n, c)/2 \rceil.$$

We will be interested primarily in C(n , c), but function T(n, c) is much more convenient 
to analyse when we prove upper and lower bounds. 
1.2 Prior Work

The asymptotically optimal bounds of 

$$\log^*(n) - O(1) \le T(n, 3) \le \log^*(n) + O(1)$$

are covered in numerous textbooks and courses on distributed and parallel computing 
[2,4,17,19,20]. The proof is almost unanimously based on the following classical results: 

Cole-Vishkin colour reduction (CV): The upper bound was presented in the modern
form by Goldberg, Plotkin, and Shannon [9] and it is based on the technique
first introduced by Cole and Vishkin [3]. The key ingredients are a fast colour
reduction algorithm that shows that $T(2^k, 2k) \le 1$ for any $k \ge 3$, and a slow
colour reduction algorithm that shows that $T(k + 1, k) \le 2$ for any $k \ge 3$. By
iterating the fast colour reduction algorithm, we can reduce the number of colours
from n to 6 in $\log^*(n) \pm O(1)$ rounds, and by iterating the slow colour reduction
algorithm, we can reduce the number of colours from 6 to 3 in 6 rounds (with
one-sided algorithms).

Linial’s lower bound: The lower bound is the seminal result by Linial [13]. The
key ingredient is a speed-up lemma that shows that $T(n, 2^c) \le T(n, c) - 1$ when
$T(n, c) \ge 1$. By iterating the speed-up lemma $\log^*(n) - 3$ times, we have
$T(n, 4) \ge T(n, k) + \log^*(n) - 3$ for a $k < n$. Clearly $T(n, 3) \ge T(n, 4)$ and
$T(n, k) \ge 1$, and hence $T(n, 3) \ge \log^*(n) - 2$.

In the upper bound, many sources, including the original papers by Cole and
Vishkin and Goldberg et al., are happy with the asymptotic bounds of $\log^*(n) + O(1)$
or $O(\log^* n)$. However, there are some sources that provide a more careful analysis. The
analysis by Barenboim and Elkin [2] yields $T(n, 3) \le \log^*(n) + 9$, and the analysis in
the textbook by Cormen et al. [4] yields $T(n, 3) \le \log^*(n) + 7$. In our lecture course [19]
we had an exercise that shows how to push it down to

$$T(n, 3) \le \log^*(n) + 6.$$

In the lower bound, there is less variation. Linial’s original proof [13] yields
$T(n, 3) \ge \log^*(n) - 3$, and many sources [2,11,19] prove a bound of

$$T(n, 3) \ge \log^*(n) - 2.$$

On the side of lower bounds, nothing stronger than Linial’s result is known. There 
are alternative proofs based on Ramsey’s theorem [5] that yield the same asymptotic
bound of $T(n, 3) = \Omega(\log^* n)$, but the constants one gets this way are worse than in
Linial’s proof.

On the side of upper bounds, however, there is an algorithm that is strictly better
than CV: Naor-Stockmeyer colour reduction (NS) [15]. While CV yields
$T(2^k, 2k) \le 1$ for any $k \ge 3$, NS yields a strictly stronger claim of $T(\binom{2k}{k}, 2k) \le 1$ for
any $k \ge 2$. However, the exact bounds that we get from NS are apparently not analysed
anywhere, and their algorithm is hardly ever mentioned in the literature. Hence the
state of the art appears to be

$$\log^*(n) - 2 \le T(n, 3) \le \log^*(n) + 6,$$

$$\tfrac{1}{2}\log^*(n) - 1 \le C(n, 3) \le \tfrac{1}{2}\log^*(n) + 3.$$

Note that we have $\log^* n \le 5$ for all $n < 10^{19728}$, and hence in practice the constant
term 6 dominates the term $\log^* n$ in the upper bound.
1.3 Contributions

In this work we derive exact bounds on C(n, 3) for infinitely many values of n, and 
near-tight bounds for all values of n. We prove that for infinitely many values of n 

$$C(n, 3) = \tfrac{1}{2}\log^* n,$$

and for all sufficiently large values of n

$$\log^*(n) - 1 \le T(n, 3) \le \log^*(n) + 1.$$

With $C(n, 3) = \lceil T(n, 3)/2 \rceil$ this gives a near-complete picture of the exact complexity
of colouring directed paths. The key new techniques are as follows: 

1. We give a new analysis of NS colour reduction. 

2. We give a new lower-bound proof that is strictly stronger than Linial’s lower 
bound. 

3. We show that computational techniques can be used to prove not only upper 
bounds but also lower bounds on T(n,c), also for the case of a general n and not 
just for fixed small values of n and c. We introduce successor graphs $\mathcal{S}_i$ that are
defined so that a graph colouring of $\mathcal{S}_i$ with a small number of colours implies an
improved bound on $T(n, 3)$.

This work focuses on colour reduction, i.e., the setting in which we are given a proper 
colouring as an input. Our upper bounds naturally apply directly in more restricted 
problems (e.g., the input labels are unique identifiers). Our lower-bound results do not
hold directly, but the key techniques are still applicable: in particular, the successor 
graph technique can be used also in the case of unique identifiers. 

1.4 Applications 

Graph colouring in paths, and the related problems of graph colouring in rooted trees and 
directed pseudoforests, are key symmetry-breaking primitives that appear as subroutines 
in numerous distributed algorithms for various graph problems [1,5,8,9,12,16]. 

One of the most direct applications of our results is related to colouring trees: In
essence, colour reduction from n to c in trees with arbitrary algorithms is the same 
problem as colour reduction from n to c in paths with one-sided algorithms. Informally, 
in the worst case the children contain all possible coloured subtrees and hence “looking 
down” in the tree is unhelpful, and we can equally well restrict ourselves to “looking up” 
towards the root. Hence our bounds on T(n, 3) can be directly interpreted as bounds 
on colour reduction from n to 3 in trees. 

The bounds also have applications outside distributed computing. A result by Fich
and Ramachandran [6] demonstrates that bounds on C(n, 3) have direct implications 
in the context of decision trees and parallel computing. 

Indeed, the fastest known parallel algorithms for colouring linked lists are just 
adaptations of CV and NS colour reduction algorithms. These algorithms reduce the 
number of colours very rapidly to a relatively small number (e.g., dozens of colours), 
and the key bottleneck has been pushing the number of colours down to 3. In particular, 
reducing the number of colours down to 3 with state-of-the-art algorithms has been much
more expensive than reducing it to 4, but this phenomenon has not been understood
so far. Prior bounds on $T(n, c)$ have not been able to show that the case of c = 3 is
necessarily more expensive than c = 4. Our improved bounds are strong enough to
separate $T(n, 4)$ and $T(n, 3)$.

Figure 1: The difference between two-sided and one-sided algorithms. (a) A two-sided
algorithm A that runs for 2 rounds. (b) A one-sided algorithm B that runs for 2 rounds.

From the perspective of practical algorithm engineering and programming, this 
work shows that we should avoid CV colour reduction, but we can be content with NS 
colour reduction; the former incurs a significant overhead (e.g., in terms of linear scans 
over the data in parallel computing), but the latter is near-optimal. 

2 Preliminaries 

Sets and Functions. For any positive integer k, we use [k] to denote the set
$\{1, 2, \dots, k\}$. For any set X, we use $2^X = \{Y : Y \subseteq X\}$ to denote the powerset of
X. Define the iterated logarithm as

$$\log^{(0)}(x) = x,$$

$$\log^{(i+1)}(x) = \log^{(i)}(\log x) \quad \text{for all } i \ge 0.$$

In this work, all logarithms are in base 2. Moreover, the log-star function is

$$\log^* x = \min\{i : \log^{(i)} x \le 1\}.$$

Finally, we define the tetration, or a power tower, with base 2 as

$${}^{0}2 = 1,$$

$${}^{i+1}2 = 2^{({}^{i}2)} \quad \text{for all } i \ge 0.$$
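
The two definitions are inverse to each other in the sense that $\log^*({}^{i}2) = i$. As a minimal sketch, both can be computed with exact integer arithmetic; the function names `tetration` and `log_star` are ours, not from the text, and inputs are assumed to be positive integers.

```python
def tetration(i: int) -> int:
    """Power tower of i twos: tetration(0) = 1 and tetration(i + 1) = 2 ** tetration(i)."""
    value = 1
    for _ in range(i):
        value = 2 ** value
    return value


def log_star(x: int) -> int:
    """Smallest i such that taking the base-2 logarithm i times brings x down to at most 1.

    For an integer x > 1, (x - 1).bit_length() equals ceil(log2(x)). Rounding up
    is harmless here because log* only changes at the integer values of the
    power tower, so no floating-point arithmetic is needed.
    """
    steps = 0
    while x > 1:
        x = (x - 1).bit_length()
        steps += 1
    return steps
```

For example, `log_star(16)` is 3 and `log_star(17)` is 4, since 16 is the power tower of height 3.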


Algorithms. In this work, we focus on algorithms that run on directed paths. We 
distinguish between two-sided and one-sided algorithms; see Figure 1. Two-sided 
algorithms correspond to the usual notion of an algorithm in the LOCAL model: an 
algorithm running for t rounds has to decide on its output using the information 
available at most t hops away. Formally, a two-sided c-colouring algorithm corresponds 
to a function 

$$A\colon [n]^{2t+1} \to [c].$$
Figure 2: The correspondence between two-sided and one-sided algorithms. (a) A
two-sided algorithm A that runs for 2 rounds; node v outputs $A(x_0, \dots, x_4)$. (b) A
one-sided algorithm B that runs in 4 rounds; node u outputs $B(x_0, \dots, x_4)$. Both nodes
see the same information, so v can easily simulate B and u can simulate A.

Moreover, as A outputs a proper colouring, the function satisfies
$A(x_0, \dots, x_{2t}) \ne A(x_1, \dots, x_{2t+1})$ when $x_i \ne x_{i+1}$ for all $i \ge 0$.

In contrast to two-sided algorithms, one-sided algorithms are algorithms in which 
nodes can only send messages to successors. Therefore, a one-sided algorithm that 
runs in t rounds can only gather information from at most t predecessors. Formally, a 
one-sided c-colouring algorithm B that runs for t steps corresponds to a function 

$$B\colon [n]^{t+1} \to [c],$$

which satisfies $B(x_0, \dots, x_t) \ne B(x_1, \dots, x_{t+1})$ when $x_i \ne x_{i+1}$ for all $i \ge 0$.

It is now easy to see that $C(n, c) = \lceil T(n, c)/2 \rceil$ holds. Intuitively, the connection is
straightforward. For example, Figure 2 illustrates how a t-time two-sided algorithm
can gather the same information as a 2t-time one-sided algorithm. For the sake of
completeness, we now prove this formally. 

Lemma 2. $C(n, c) = \lceil T(n, c)/2 \rceil$.

Proof. First, we show that $T(n, c) \le 2C(n, c)$. Let $t = C(n, c)$ and $A\colon [n]^{2t+1} \to [c]$
be a two-sided c-colouring algorithm that runs in time t. We construct a one-sided
c-colouring algorithm that runs in time 2t. Recall that a one-sided algorithm can only
receive messages from predecessors. Initially, every node sends its own colour to its
successor. Then for $2t - 1$ rounds each node forwards the colour received from its
predecessor to its successor; in the case that a node has no predecessors, the node can
simply simulate a properly coloured path preceding it. After 2t rounds the node knows its own
colour and the colours of its 2t predecessors, that is, a vector $(x_0, \dots, x_{2t}) \in [n]^{2t+1}$.
Outputting the value $A(x_0, \dots, x_{2t})$ yields a proper colouring.

Second, we show that $C(n, c) \le \lceil T(n, c)/2 \rceil$. Let $t = \lceil T(n, c)/2 \rceil$ and
$B\colon [n]^{T(n,c)+1} \to [c]$ be a one-sided algorithm that only receives messages from
predecessors. Every node sends its colour to both neighbours and then forwards any
messages in the $t - 1$ subsequent rounds. As $2t \ge T(n, c)$, after t rounds every node knows
the colours $(x_0, \dots, x_{T(n,c)})$ in its local neighbourhood. Now the node can output
$B(x_0, \dots, x_{T(n,c)})$ which gives a proper colouring.

Finally, the first part gives $C(n, c) \ge T(n, c)/2$, and since the time complexity has to be
integral (there are no “half-rounds”), we get that $C(n, c) = \lceil T(n, c)/2 \rceil$. □
3 The Upper Bound


In this section, we bound $T(n, c)$ from above. To do this, we analyse the
Naor-Stockmeyer (NS) colour reduction algorithm [15]. The NS algorithm is one-sided, thus
yielding upper bounds for $T(n, c)$.

Let us first recall the NS colour reduction algorithm. Let $n \le \binom{2k}{k}$ for some $k \ge 2$
and fix an injection $f\colon [n] \to X$, where $X = \{Y \subseteq [2k] : |Y| = k\}$. That is, we interpret
all colours from [n] as distinct k-subsets of [2k].

The algorithm works as follows. First, all nodes send their colour to the successor.
Then a node with colour v receiving colour u from its predecessor will output

$$A(u, v) = \min\bigl(f(u) \setminus f(v)\bigr).$$

It is easy to show that if $u \ne v \ne w$, then $A(u, v) \in [2k]$ and $A(u, v) \ne A(v, w)$ holds:
the value $A(u, v)$ is not in $f(v)$, whereas $A(v, w)$ is.
Thus, A is a one-sided colour reduction algorithm that reduces the number of colours
from $\binom{2k}{k}$ to 2k colours in one round and we have that $T(\binom{2k}{k}, 2k) = 1$ for any $k \ge 2$.
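
As a minimal sketch of the NS rule, the following code builds the output function for a given k. The helper name `make_ns_reduction` and the particular injection f (the i-th k-subset in lexicographic order) are our assumptions; any injection would do.

```python
from itertools import combinations


def make_ns_reduction(k: int):
    """Return the one-round rule A(u, v) = min(f(u) \\ f(v)) for colours 1..C(2k, k)."""
    # Assumed injection f: colour i is mapped to the i-th k-subset of {1, ..., 2k}
    # in lexicographic order.
    subsets = [frozenset(s) for s in combinations(range(1, 2 * k + 1), k)]

    def reduce_colour(u: int, v: int) -> int:
        # u: colour received from the predecessor, v: the node's own colour (u != v).
        # Two distinct k-subsets differ, so the set difference is never empty.
        return min(subsets[u - 1] - subsets[v - 1])

    return reduce_colour
```

With k = 3 the rule maps a proper colouring with 20 colours to a proper colouring with 6 colours; a node without a predecessor can simulate one, as in the proof of Lemma 2.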

The above algorithm cannot reduce the number of colours below 4. To reduce the 
number of colours from four to three, we can use the following one-sided algorithm B 
that outputs 


$$B(u, v, w) = \begin{cases} \min\bigl(\{1, 2, 3\} \setminus \{u, w\}\bigr) & \text{if } v = 4, \\ v & \text{otherwise.} \end{cases}$$


The algorithm uses two rounds and this is optimal by Lemma 8 in Section 4. 
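
The correctness of B is easy to confirm exhaustively. The sketch below assumes the convention $B(x_0, \dots, x_t)$ from Section 2, so that the arguments are the colours of three consecutive nodes on the path; the function names are ours.

```python
from itertools import product


def remove_fourth_colour(u: int, v: int, w: int) -> int:
    """Rule B(u, v, w) from the text; the inputs satisfy u != v and v != w."""
    if v == 4:
        # u and w are both different from 4, so at least one of 1, 2, 3 is free.
        return min({1, 2, 3} - {u, w})
    return v


def is_proper(rule, n: int, colours_out: int) -> bool:
    """Exhaustively check the rule on all properly coloured 4-node windows over 1..n."""
    for x in product(range(1, n + 1), repeat=4):
        if x[0] == x[1] or x[1] == x[2] or x[2] == x[3]:
            continue  # not a proper input colouring
        first, second = rule(*x[:3]), rule(*x[1:])
        in_range = 1 <= first <= colours_out and 1 <= second <= colours_out
        if first == second or not in_range:
            return False
    return True
```

Calling `is_proper(remove_fourth_colour, 4, 3)` checks both that adjacent outputs differ and that only colours 1, 2, 3 are used.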

We now show the following upper bounds for T(n, c) using the NS colour reduction 
algorithm. 


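Lemma 3. The function T satisfies the following:

(a) $T(\tfrac{3}{2} \cdot 2^c, \tfrac{3}{2} \cdot c) = 1$ for any c = 4h, where h > 1,

(b) $T(\tfrac{3}{2} \cdot {}^{r+4}2, \tfrac{3}{2} \cdot {}^{4}2) \le r$ for any $r \ge 0$,

(c) $T(\tfrac{3}{2} \cdot {}^{4}2, 3) \le 5$.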

Proof.

(a) As discussed, the NS colour reduction algorithm shows that $T(\binom{2k}{k}, 2k) = 1$ for
$k \ge 2$. Recall the following bound for the central binomial coefficient

$$\binom{2k}{k} \ge \frac{4^k}{\sqrt{4k}}$$

and let 2k = 3c/2. Since $c \ge 8$ it follows that

$$\binom{2k}{k} \ge \frac{(2 \cdot 2)^{3c/4}}{\sqrt{4k}} = \frac{2^{c/2}}{\sqrt{3c}} \cdot 2^c \ge \frac{3}{2} \cdot 2^c.$$

(b) To show the claim, it suffices to apply part (a) for r times.

(c) As $\binom{20}{10} \ge \tfrac{3}{2} \cdot {}^{4}2$, we can reduce the number of colours to 4 in three rounds as
follows: $\binom{20}{10} \to \binom{6}{3} \to \binom{4}{2} \to 4$. The remaining two rounds can be used to
remove the fourth colour with algorithm B above; by Lemma 8, this step cannot be done faster. □
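
The arithmetic behind parts (a) and (c) can be checked mechanically. The following is a small sketch with exact integers; the function names `lemma_3a_inequality` and `ns_chain` are ours, and `ns_chain` greedily picks the smallest admissible k in each round, which is one natural choice rather than something prescribed by the text.

```python
from math import comb


def lemma_3a_inequality(c: int) -> bool:
    """Check C(2k, k) >= (3/2) * 2**c for 2k = 3c/2, using integers only."""
    assert c % 4 == 0, "part (a) assumes c = 4h"
    k = 3 * c // 4
    return 2 * comb(2 * k, k) >= 3 * 2 ** c


def ns_chain(n: int) -> list:
    """Colour counts obtained by applying NS repeatedly until 4 colours remain."""
    chain = [n]
    while n > 4:
        k = 2
        while comb(2 * k, k) < n:  # smallest k with n <= C(2k, k)
            k += 1
        n = 2 * k
        chain.append(n)
    return chain
```

Starting from $\tfrac{3}{2} \cdot {}^{4}2 = 98304$ colours, `ns_chain(98304)` gives 98304, 20, 6, 4, that is, three NS rounds, matching part (c). The inequality of part (a) fails for c = 4, which is why h > 1 is required.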


Theorem 4. $T({}^{h}2, 3) \le T({}^{h}2 + 1, 3) \le h + 1$ holds for any h > 1.

Proof. The cases $2 \le h < 4$ follow from the proof of Lemma 3c. Suppose h = r + 4 for
some $r \ge 0$. By Lemma 3b and c we can get a 3-colouring in r + 5 = h + 1 rounds. □
4 The Lower Bound


In this section, we give a new lower bound for the time complexity of one-sided colour 
reduction algorithms. The proof follows the basic idea of Linial’s proof [13] adapted to 
the case of colour reduction, but we show a new lemma that can be used to tighten the 
bound. 

The proof is structured as follows. First, we show that $T(n, 2^c - 2) \le T(n, c) - 1$,
that is, given a c-colouring algorithm, we can devise a faster algorithm that uses at
most $2^c - 2$ colours; this is just a minor tightening of the usual standard bound, and
should be fairly well-known. Second, we prove that a fast 3-colouring algorithm implies
a fast 16-colouring algorithm, more precisely, $T(n, 16) \le T(n, 3) - 2$; this is the key
contribution of this section. Together these yield the following new bound:

Theorem 5. For any h > 1, we have $T({}^{h}2, 3) \ge h$.

4.1 The Speed-up Lemma 

Lemma 6. If $T(n, c) \ge 1$, then $T(n, 2^c - 2) \le T(n, c) - 1$.

Proof. Let $t = T(n, c)$ and $A\colon [n]^{t+1} \to [c]$ be a one-sided c-colouring algorithm. We
will construct a faster one-sided algorithm B as follows. Consider a node u and its
successor v. In $t - 1$ rounds, node u can find out the colours of its $t - 1$ predecessors
and its own colour, that is, some vector $(x_0, \dots, x_{t-1}) \in [n]^t$. In particular, node u now
knows what information node v can gather in t rounds except the colour of v since A is
one-sided. However, u can enumerate all the possible outputs of v which give the set

$$B(x_0, \dots, x_{t-1}) = \{A(x_0, \dots, x_{t-1}, y) : y \ne x_{t-1},\ y \in [n]\} \subseteq [c].$$

Clearly $B(x_0, \dots, x_{t-1}) \ne \emptyset$. We also have $B(x_0, \dots, x_{t-1}) \ne [c]$: For the sake of
contradiction, suppose otherwise. This would imply that v could output any value in
[c]. In particular, if u outputs $A(z, x_0, \dots, x_{t-1}) = a$ for some $z \in [n]$, we could pick
$y \in [n]$ such that $A(x_0, \dots, x_{t-1}, y) = a$ as well. However, this would contradict the
fact that A was a colouring algorithm. Hence there exists an injection f that maps any
possible set $B(\cdot)$ to a value in $[2^c - 2]$.

It remains to argue that no two adjacent nodes construct the same set. Suppose a
node u outputs set X and its successor v also outputs X. Now we can pick $k \in X$ such
that

$$A(x_0, \dots, x_{t-1}, y) = k = A(x_1, \dots, x_{t-1}, y, y')$$

for some $x_{t-1} \ne y \ne y'$, contradicting that A outputs a proper colouring. Therefore,
$f \circ B$ is a one-sided $(2^c - 2)$-colouring algorithm that runs in time $t - 1 = T(n, c) - 1$. □
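
The construction in the proof can be tried out on a small instance. The sketch below applies it to the two-round rule of Section 3 that reduces 4 colours to 3 (so n = 4, c = 3, t = 2); the names `speed_up` and `four_to_three` are ours, and the injection f is left implicit by working with the sets themselves.

```python
from itertools import product


def speed_up(A, n: int, t: int):
    """Given a one-sided t-round rule A on colours 1..n, return the (t-1)-round rule B.

    B(x_0, ..., x_{t-1}) is the set of all outputs the successor might produce,
    taken over every colour y != x_{t-1} that the successor could have.
    """
    def B(*xs: int) -> frozenset:
        assert len(xs) == t
        return frozenset(A(*xs, y) for y in range(1, n + 1) if y != xs[-1])

    return B


def four_to_three(u: int, v: int, w: int) -> int:
    """Example input: the two-round rule from Section 3 that removes colour 4."""
    return min({1, 2, 3} - {u, w}) if v == 4 else v


B = speed_up(four_to_three, n=4, t=2)
# All properly coloured windows of three consecutive nodes.
windows = [x for x in product(range(1, 5), repeat=3) if x[0] != x[1] != x[2]]
```

On every window (a, b, c), the sets `B(a, b)` and `B(b, c)` differ, and each set is a nonempty proper subset of {1, 2, 3}, so at most $2^3 - 2 = 6$ distinct outputs occur.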

Lemma 7. For any r > 0, we have $T({}^{r+3}2, 16) \ge r + 1$.

Proof. Fix r > 0. We repeatedly apply Lemma 6. Now suppose we have an algorithm
that reduces the number of colours from n to $16 = {}^{3}2$ in r rounds. That is, $T(n, {}^{3}2) \le r$
holds for some n > 3. From Lemma 6 it follows that

$$T(n, {}^{3}2) \le r \implies T(n, {}^{4}2 - 2) \le r - 1 \implies \cdots \implies T(n, {}^{3+r}2 - 2) \le 0,$$

but as $T(k, k - 1) \ge 1$ for any k it follows that $n < {}^{3+r}2$. Thus, $T({}^{r+3}2, 16) \ge r + 1$. □
4.2 Proof of Theorem 5


In addition to the speed-up lemma, we need a few more lemmas that bound T(n, 3) 
below for small values of n. 

Lemma 8. $T(4, 3) \ge 2$.

Proof. Let $B'\colon (u, v) \mapsto \{1, 2, 3\}$ be a one-sided 3-colouring algorithm that runs in one
round. Now B' yields a partitioning of the possible input pairs (u, v) where $u \ne v$. It
is simple to check that there always exists a pair (u, v) with $u \ne v$ such that there also
exists some $w \ne v$ satisfying $B'(u, v) = B'(v, w)$. □
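
The check mentioned in the proof is small enough to run by exhaustive search. As a sketch (the function name and the plain backtracking search are our choices, not the authors' tooling), we look for any assignment of c output colours to the pairs (u, v) such that (u, v) and (v, w) never receive the same colour:

```python
from itertools import permutations


def one_round_colouring_exists(n: int, c: int) -> bool:
    """Decide whether some rule B'(u, v) with c outputs is proper on inputs from 1..n."""
    pairs = list(permutations(range(1, n + 1), 2))  # all (u, v) with u != v
    # (u, v) and (v, w) may sit on adjacent nodes whenever w != v.
    conflicts = {p: set() for p in pairs}
    for u, v in pairs:
        for w in range(1, n + 1):
            if w != v:
                conflicts[(u, v)].add((v, w))
                conflicts[(v, w)].add((u, v))

    assignment = {}

    def extend(index: int) -> bool:
        if index == len(pairs):
            return True
        pair = pairs[index]
        for colour in range(c):
            if all(assignment.get(q) != colour for q in conflicts[pair]):
                assignment[pair] = colour
                if extend(index + 1):
                    return True
                del assignment[pair]
        return False

    return extend(0)
```

For n = 4 and c = 3 the search fails, in line with Lemma 8, whereas c = 4 succeeds trivially (each node can keep its own colour).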

Lemma 9. $T(16, 3) \ge 3$.

Proof. As observed by Linial [13], we can bound $C(n, c)$ by analysing the so-called
neighbourhood graph $\mathcal{N}_{n,t}$: a t-round two-sided c-colouring algorithm exists if and only
if $\mathcal{N}_{n,t}$ is c-colourable. While Linial analytically bounded the chromatic
number of such graphs, we can also compute their chromatic numbers exactly for small
values of n, c, and t; see [18] for a detailed discussion. We use the latter technique to
show the claimed bound. That is, the neighbourhood graph $\mathcal{N}_{7,1}$ is not 3-colourable.
The neighbourhood graph $\mathcal{N}_{7,1} = (V, E)$ is defined as follows. The set of vertices is

$$V = \{(x_0, x_1, x_2) \in [n]^3 : x_0 \ne x_1 \ne x_2,\ x_0 \ne x_2\},$$

where n = 7 and the set of edges is

$$E = \{\{u, v\} : u, v \in V,\ u = (x_0, x_1, x_2),\ v = (x_1, x_2, x_3)\}.$$

It is easy to check with a computer (e.g. using any off-the-shelf SAT or IP solver)
that the graph $\mathcal{N}_{7,1}$ is not 3-colourable. Therefore, C(7, 3) > 1 and in particular
$T(16, 3) \ge T(7, 3) > 2$. □
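
Constructing the graph is a direct transcription of the two definitions. The sketch below only builds the vertex and edge sets (the function name is ours); deciding 3-colourability is left to an external SAT or IP solver, as in the proof, and is not attempted here.

```python
from itertools import permutations


def neighbourhood_graph(n: int):
    """Vertices and edges of the neighbourhood graph defined in the proof of Lemma 9."""
    # permutations(..., 3) yields exactly the triples with pairwise distinct entries.
    vertices = set(permutations(range(1, n + 1), 3))
    edges = set()
    for (x0, x1, x2) in vertices:
        for x3 in range(1, n + 1):
            if (x1, x2, x3) in vertices:
                edges.add(frozenset({(x0, x1, x2), (x1, x2, x3)}))
    return vertices, edges


vertices, edges = neighbourhood_graph(7)
```

For n = 7 this gives 210 vertices and 1050 edges, an instance that is small by the standards of modern solvers.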

To get a lower bound for 3-colouring, we show in the following sections that the 
existence of a t-time one-sided 3-colouring algorithm implies a $(t - 2)$-time one-sided
16-colouring algorithm.

Lemma 10. For any $n \ge 16$, it holds that $T(n, 16) \le T(n, 3) - 2$.

Now we have all the results for showing the lower bound. 

Theorem 5. For any h > 1, we have $T({}^{h}2, 3) \ge h$.

Proof. The cases h = 2 and h = 3 follow from Lemmas 8 and 9. For the remaining cases,
let h = r + 3 for some r > 0. Suppose $T({}^{h}2, 3) = T({}^{r+3}2, 3) < h$. Then by Lemma 10
we would get that $T({}^{r+3}2, 16) < h - 2 = r + 1$ which contradicts Lemma 7. □

4.3 Proof of Lemma 10 via Successor Graphs 

To prove Lemma 10, we analyse the chromatic number of so-called successor graphs —a 
notion similar to Linial’s neighbourhood graphs [13]. In the following, given a binary 
relation R, we will write $x \in R(y)$ to mean $(y, x) \in R$.



Colouring Relations. Suppose $A = A_0$ is a one-sided 3-colouring algorithm that
runs in t rounds. Let $A_1, \dots, A_t$ denote the one-sided algorithms given by iterating
Lemma 6 and $C_{k+1} \subseteq 2^{C_k}$ be the set of colours output by algorithm $A_{k+1}$.

In the following, let $t' = t - k$. Define the potential successor relation $S_k \subseteq C_k \times C_k$
to be a binary relation such that $(x, y) \in S_k$ if there exist $x_0, \dots, x_{t'}$ where $x_i \ne x_{i+1}$
such that

$$A_k(x_0, \dots, x_{t'-1}) = x \quad \text{and} \quad A_k(x_1, \dots, x_{t'}) = y.$$

That is, in the output of algorithm $A_k$ there can be an x-coloured node with a successor
of colour y. Moreover, define the output relation $R_k \subseteq C_k \times C_{k+1}$ such that $(x, X) \in R_k$
if

$$A_k(x_0, \dots, x_{t'-2}, x) = X$$

for some $x_0, \dots, x_{t'-2}$ where $x_i \ne x_{i+1}$. That is, a node with colour x can output colour
X when executing $A_{k+1}$. From the construction of $A_{k+1}$ given in Lemma 6, we get
that $R_k = \{(x, X) : X \subseteq S_k(x),\ X \ne \emptyset\}$.

Lemma 11. Suppose $X \in R_k(x)$, $Y \in R_k(y)$, and $y \in X$ for some $x, y \in C_k$, then
$(X, Y) \in S_{k+1}$ holds. Moreover, the converse holds.

Proof. As we have $y \in X \subseteq S_k(x)$, this means that a node with colour x may have
a successor of colour y after executing algorithm $A_k$. Moreover, as $X \in R_k(x)$ and
$Y \in R_k(y)$ hold, a node with colour x may output X and a node with colour y may
output Y when executing $A_{k+1}$. Thus, after executing $A_{k+1}$ we may have a node with
colour X that has a successor with colour Y. Therefore, $(X, Y) \in S_{k+1}$.

To show the converse, suppose that $(X, Y) \in S_{k+1}$, that is, in some output of $A_{k+1}$
there is a node u with colour X that has a successor v with colour Y. Now there must exist
some colour x such that $X \in R_k(x)$ and some colour y such that $Y \in R_k(y)$. As v is a
successor of u, the algorithm $A_{k+1}$ outputs a set X consisting of all possible colours for
any successor of u, and thus, we have $y \in X$. □

Successor Graphs. For any choice of $A = A_0$, we can construct the successor
relation $S_k$ and using this relation, we can define the successor graph of A to be the
graph $\mathcal{S}_k(A) = (C_k, E_k)$, where $E_k = \{\{x, y\} : (x, y) \in S_k\}$. These graphs have the
following property:

Lemma 12. Let $\mathcal{S}_k = (C_k, E_k)$ be the successor graph of A, and let t be the running
time of A. If $f\colon C_k \to [\chi]$ is a proper colouring of $\mathcal{S}_k$, then $f \circ A_k$ is a one-sided
$\chi$-colouring algorithm that runs in $t - k$ rounds. That is, $T(n, \chi) \le t - k$.

Proof. Let u be the predecessor of v on a directed path. Now by definition,

$$A_k(x_0, \dots, x_{t-1}, u) = x \ne y = A_k(x_1, \dots, x_{t-1}, u, v) \implies (x, y) \in S_k \implies f(x) \ne f(y).$$

Therefore, $f \circ A_k$ is a one-sided $\chi$-colouring algorithm. □

In the next section, we show the following lemma from which Lemma 10 follows. 

Lemma 13. For any t-time 3-colouring algorithm A, the successor graph $\mathcal{S}_2(A)$ can
be coloured with 16 colours.

In particular, this holds for an optimal algorithm A with a running time of t = 
T(n, 3). Together with Lemma 12, this implies Lemma 10. We next show how to prove 
Lemma 13 in two ways: with computers, and without them. 
4.4 A Human-Readable Proof of Lemma 13

We start by giving a traditional human-readable proof for Lemma 13. That is, we argue
that for any one-sided 3-colouring algorithm $A = A_0$ the successor graph $\mathcal{S}_2(A)$ can be
coloured with 16 colours. Later in Section 4.5, we give a computational proof of the
same result. In the following, we fix A and denote $\mathcal{S}_2 = \mathcal{S}_2(A)$ for brevity.

Structural Properties. We start with the following observations. 

Remark 1. Sets $C_0$ and $C_1$ satisfy

$$C_0 \subseteq \{1, 2, 3\},$$

$$C_1 \subseteq \{\{1\}, \{2\}, \{3\}, \{1,2\}, \{1,3\}, \{2,3\}\}.$$

Remark 2. Relation $S_1$ satisfies

$$S_1(\{i\}) \subseteq \{X \in C_1 : i \notin X\},$$
$$S_1(\{i, j\}) \subseteq \{X \in C_1 : \{i, j\} \not\subseteq X\}.$$

Remark 3. Consider any $X \subseteq C_1$ with $\{\{1,2\}, \{1,3\}, \{2,3\}\} \subseteq X$. Then there is no
$x \in C_1$ with $X \subseteq S_1(x)$. Therefore $A_2$ cannot output colour X, and hence $X \notin C_2$.

Hence graph $\mathcal{S}_2$ has $|C_2| \le 55$ nodes: out of the $2^6 = 64$ candidate colours, we can
exclude the empty set and 8 other sets identified in Remark 3. We will now partition
the remaining nodes in 16 colour classes (independent sets).

Colour Classes. There are four types of colour classes. First, for each $\emptyset \ne I \subseteq [3]$
we define a singleton colour class

$$\mathcal{X}_0(I) = \bigl\{\{\{i\} : i \in I\}\bigr\},$$

that is, an independent set of size 1. Then for each triple 

$$(i, j, k) \in \{(1, 2, 3), (2, 3, 1), (3, 1, 2)\}$$

we have three colour classes: 

$$\mathcal{X}_1(i, j, k) = \bigl\{X \in C_2 : \{\{i,j\}, \{i,k\}\} \subseteq X \subseteq \{\{i,j\}, \{i,k\}, \{i\}, \{j\}, \{k\}\}\bigr\},$$

$$\mathcal{X}_2(i, j, k) = \bigl\{X \in C_2 : \{\{i,j\}, \{k\}\} \subseteq X \subseteq \{\{i,j\}, \{i\}, \{j\}, \{k\}\}\bigr\},$$

$$\mathcal{X}_3(i, j, k) = \bigl\{X \in C_2 : \{\{i,j\}\} \subseteq X \subseteq \{\{i,j\}, \{i\}, \{j\}\}\bigr\}.$$

In total, there are 7 singleton colour classes, and 3 × 3 other colour classes, giving in
total 16 colour classes. Figure 3 shows the complement of a supergraph of $\mathcal{S}_2$; each of
the above colour classes corresponds to a clique in the complement graph.

It can be verified that each of the 55 possible nodes of $\mathcal{S}_2$ is included in exactly
one of the colour classes. It remains to be shown that each colour class is indeed an
independent set of $\mathcal{S}_2$.

The singleton classes form independent sets trivially. We handle each type of the 
remaining colour classes separately. Recall that there is an edge {X, Y} in $\mathcal{S}_2$ if either
$X \in S_2(Y)$ or $Y \in S_2(X)$.
Figure 3: This illustration shows the complement of a graph we call $\mathcal{S}_2^*$. For any
algorithm A, the successor graph $\mathcal{S}_2(A)$ is a subgraph of $\mathcal{S}_2^*$, and hence, a proper
colouring of $\mathcal{S}_2^*$ is a proper colouring of $\mathcal{S}_2(A)$. Each clique in the figure corresponds to
a colour class in $\mathcal{S}_2^*$. We use a shorthand notation: for example, the circle labelled with
“1 2 12” is the node {{1}, {2}, {1, 2}}.



Lemma 14. The class $\mathcal{X}_1(i, j, k)$ forms an independent set in $\mathcal{S}_2$.

Proof. Let $\mathcal{X} = \mathcal{X}_1(i, j, k)$. Observe that for any $X \in \mathcal{X}$ we have $\{\{i,j\}, \{i,k\}\} \subseteq X$
and $\{j, k\} \notin X$. From Remark 2 it easily follows that the relation $S_1$ satisfies

$$\{\{i,j\}, \{i,k\}\} \subseteq X \in 2^{S_1(x)} \implies x = \{j, k\}.$$

In particular, we get that $X \in \mathcal{X} \implies X \in R_1(\{j, k\})$.

In order to show that $\mathcal{X}$ is an independent set in $\mathcal{S}_2$, let Y and Z be vertices of
$\mathcal{S}_2$ such that $Y \in S_2(Z)$. First, if $Y \in \mathcal{X}$, then $Y \in R_1(\{j, k\})$. In particular, this
means that $\{j, k\} \in Z$ and we get that $Z \notin \mathcal{X}$. For the second case, suppose $\{j, k\} \notin Z$.
This means that a node with colour Z cannot have a successor with colour {j, k} in
a colouring produced by $A_1$. Thus, it must be that $Y \notin R_1(\{j, k\})$. By the earlier
observation, we get that $Y \notin \mathcal{X}$. □

Lemma 15. The class $\mathcal{X}_2(i, j, k)$ is independent in $\mathcal{S}_2$.

Proof. Let $\mathcal{X} = \mathcal{X}_2(i, j, k)$. In this class, for every $X \in \mathcal{X}$ it holds that $\{\{i,j\}, \{k\}\} \subseteq X$.
From Remark 2 it follows that

$$\{\{i,j\}, \{k\}\} \subseteq X \in 2^{S_1(x)} \implies x \in \{\{i,k\}, \{j,k\}\}.$$

Thus, $X \in \mathcal{X} \implies X \in R_1(\{i,k\}) \cup R_1(\{j,k\})$.

In order to show that the class $\mathcal{X}$ forms an independent set in $\mathcal{S}_2$, suppose $Y \in S_2(Z)$
for some $Y, Z \in C_2$. First, if $Y \in \mathcal{X}$, then we have that either $\{i,k\} \in Z$ or $\{j,k\} \in Z$
so $Z \notin \mathcal{X}$. Second, if $Z \in \mathcal{X}$, then $\{\{i,k\}, \{j,k\}\} \cap Z = \emptyset$. This means that a node
with colour Z cannot have a successor of colour {i, k} or {j, k}, hence
$Y \notin R_1(\{i,k\}) \cup R_1(\{j,k\})$. □

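Lemma 16. The class $\mathcal{X}_3(i, j, k)$ forms an independent set in $\mathcal{S}_2$.

Proof. Let $\mathcal{X} = \mathcal{X}_3(i, j, k)$. Observe that for any $X \in \mathcal{X}$ it holds that $\{i, j\} \in X$ and
$\{\{i,k\}, \{j,k\}, \{k\}\} \cap X = \emptyset$. Using Remark 2 we can check that relation $S_1$ satisfies

$$\{i, j\} \in X \in 2^{S_1(x)} \implies k \in x.$$

Thus, every $X \in \mathcal{X}$ can be output only by a node whose colour x contains k, that is,
$x \in \{\{k\}, \{i,k\}, \{j,k\}\}$.

To see that $\mathcal{X}$ is an independent set in $\mathcal{S}_2$, let $Y \in S_2(Z)$. There are two cases to
consider. First, if $Y \in \mathcal{X}$, then Y is output by a node whose colour contains k, and this
colour is in Z, so we must have that $Z \notin \mathcal{X}$. Second, if $Z \in \mathcal{X}$, then no colour in Z
contains k. This means that a node with colour Z cannot have a successor whose colour
contains k, hence Y is output by a node whose colour does not contain k. Thus, $Y \notin \mathcal{X}$. □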

4.5 Computational Proof of Lemma 13 

We now give a computational proof of Lemma 13, that is, we show how to easily
verify with a computer that the claim holds. Essentially this amounts to checking that
for every choice of $A = A_0$, the successor graph $\mathcal{S}_2(A)$ is colourable with 16 colours.
However, since any successor graph $\mathcal{S}_2(A)$ depends on the choice of initial one-sided
3-colouring algorithm $A = A_0$, and there are potentially many choices for A, we instead
bound the chromatic number of a closely-related graph $\mathcal{S}_2^*$ that contains $\mathcal{S}_2(A)$ for any
A as a subgraph.

To construct the graph $\mathcal{S}_2^*$, we consider the successor graph of a “worst-case”
algorithm that may output “all possible” colours in its output set. Specifically, this
means that we simply replace the subset relation in Remarks 1 and 2 with an equality.
Therefore, the graph $\mathcal{S}_2^*$ can be constructed using a fairly straightforward computer
program, with a mechanical application of the definitions. The end result is a dense
graph on 55 nodes; its complement is shown in Figure 3.

It is now easy to discover a colouring of graph $\mathcal{S}_2^*$ that uses 16 colours with the
help of e.g. modern SAT solvers. This implies that any subgraph $\mathcal{S}_2(A)$ can also be
coloured with 16 colours and Lemma 13 follows.
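
The construction can be sketched in a few lines. The code below is our own illustration, not the authors' program: it builds the worst-case graph by taking Remarks 1 and 2 with equality, uses Lemma 11 for the successor relation, and then, instead of calling a SAT solver, verifies the 16 colour classes of Section 4.4 directly. The triples are taken to be the cyclic rotations of (1, 2, 3), an assumption made here so that the three classes of each type are pairwise distinct.

```python
from itertools import combinations


def nonempty_subsets(items):
    items = list(items)
    return [frozenset(c) for r in range(1, len(items) + 1) for c in combinations(items, r)]


def s(*elements):
    return frozenset(elements)


C0 = frozenset({1, 2, 3})
# Remark 1 with equality: every nonempty proper subset of C0.
C1 = [X for X in nonempty_subsets(C0) if X != C0]
# Remark 2 with equality: Y may follow X iff some colour of X is missing from Y.
S1 = {X: frozenset(Y for Y in C1 if not X <= Y) for X in C1}
# A_2 can output X iff X is a nonempty subset of S1[x] for some colour x of A_1.
C2 = [X for X in nonempty_subsets(C1) if any(X <= S1[x] for x in C1)]


def may_follow(X, Y) -> bool:
    """Lemma 11: Y can succeed X iff Y is a subset of S1[y] for some y in X."""
    return any(Y <= S1[y] for y in X)


def adjacent(X, Y) -> bool:
    return may_follow(X, Y) or may_follow(Y, X)


def between(lower, upper):
    lower, upper = frozenset(lower), frozenset(upper)
    return [X for X in C2 if lower <= X <= upper]


# Seven singleton classes, then three classes per triple (i, j, k).
classes = [[frozenset(s(i) for i in I)] for I in nonempty_subsets(C0)]
for i, j, k in [(1, 2, 3), (2, 3, 1), (3, 1, 2)]:
    classes.append(between({s(i, j), s(i, k)}, {s(i, j), s(i, k), s(i), s(j), s(k)}))
    classes.append(between({s(i, j), s(k)}, {s(i, j), s(i), s(j), s(k)}))
    classes.append(between({s(i, j)}, {s(i, j), s(i), s(j)}))

covered = [X for cls in classes for X in cls]
is_partition = len(covered) == len(C2) and set(covered) == set(C2)
is_independent = all(not adjacent(X, Y) for cls in classes for X in cls for Y in cls)
```

Under these assumptions the graph has 55 nodes, the 16 classes partition them, and no class contains an edge, which is exactly the statement needed for Lemma 13.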

5 Main Theorems 

We now have all the pieces for proving Theorem 1: 

Theorem 1. For infinitely many values of n, it takes exactly $\frac{1}{2}\log^* n$ rounds to compute
a 3-colouring of a directed path.

Proof. Let $n = {}^{2k+1}2 + 1$ for any $k \ge 2$. By Lemma 2 we have the identity

$$C(n, 3) = \lceil T(n, 3)/2 \rceil \qquad (1)$$

and from Theorems 4 and 5 we get that

$$2k + 1 \le T(n, 3) \le 2k + 2,$$

which together with (1) yields C(n, 3) = k + 1. Since $\log^* n = 2k + 2$ it follows that
$C(n, 3) = k + 1 = \tfrac{1}{2}\log^* n$. □
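
To make the arithmetic of the proof concrete, the following sketch evaluates it for a given k. All names are ours; the bounds on T are simply those stated by Theorems 4 and 5, not something the code derives. Only small k are feasible, since n itself is a power tower.

```python
def tetration(i: int) -> int:
    """Power tower of i twos, with tetration(0) = 1."""
    value = 1
    for _ in range(i):
        value = 2 ** value
    return value


def log_star(x: int) -> int:
    """Exact log* for positive integers; (x - 1).bit_length() is ceil(log2(x)) for x > 1."""
    steps = 0
    while x > 1:
        x = (x - 1).bit_length()
        steps += 1
    return steps


def theorem_1_instance(k: int):
    """Return (log* n, possible values of C(n, 3)) for n = tetration(2k + 1) + 1."""
    h = 2 * k + 1
    n = tetration(h) + 1
    lower, upper = h, h + 1  # bounds on T(n, 3) from Theorems 5 and 4
    # C(n, 3) = ceil(T(n, 3) / 2) by Lemma 2; collect it over the admissible range.
    two_sided = {-(-t // 2) for t in range(lower, upper + 1)}
    return log_star(n), two_sided
```

For k = 2 we have $n = 2^{65536} + 1$ and $\log^* n = 6$, and both admissible values of T(n, 3), namely 5 and 6, round up to the same C(n, 3) = 3.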

For the remaining values of n we get almost-tight bounds. There remains a slack of 
one communication round in the upper and lower bounds for C(n,3). 

Theorem 17. For any $n \ge 4$,

$$\tfrac{1}{2}(\log^* n - 1) \le C(n, 3) \le \tfrac{1}{2}(\log^* n + 1).$$

Proof. For n = 4, we have shown that T(4, 3) = 2 so the bounds follow. Fix n > 4.
Now there exists some h > 1 such that $n \in \{{}^{h}2 + 1, \dots, {}^{h+1}2\}$ and $h = \log^* n - 1$.
Theorems 4 and 5 give us the bounds

$$\log^* n - 1 = h \le T(n, 3) \le h + 2 = \log^* n + 1$$

and since $C(n, 3) = \lceil T(n, 3)/2 \rceil$, the claimed bounds follow. □

6 Conclusions and Discussion 

In this work we gave exact and near-exact bounds on the complexity of distributed 
graph colouring. The key result is that the complexity of colour reduction from n to 3 
on directed paths and cycles is exactly $\frac{1}{2}\log^* n$ rounds for infinitely many values of n,
and very close to it for all values of n. 
In essence, we have shown that the colour reduction algorithm by Naor and Stockmeyer
is near-optimal, while the algorithm by Cole and Vishkin is suboptimal. We
have also seen that Linial’s lower bound still had some room for improvement.

One of the novel techniques of this work was the use of computers in lower-bound 
proofs. Two key elements are results of a computer search: 

• Lemma 9: The proof of $T(16, 3) \ge 3$ is based on the analysis of the chromatic
number of the neighbourhood graph $\mathcal{N}_{7,1}$.

• Lemma 10: The proof of $T(n, 16) \le T(n, 3) - 2$ is based on the analysis of the
chromatic number of the successor graph $\mathcal{S}_2$.

In both cases we used computers to analyse the chromatic numbers of various successor 
graphs and neighbourhood graphs, in order to find the right parameters for our needs. 

The idea of analysing neighbourhood graphs and their chromatic numbers is 
commonly used in the context of human-designed lower-bound proofs [7,10,13,14]. It 
is also fairly straightforward to construct neighbourhood graphs so that we can use 
computers and graph-colouring algorithms to discover new upper bounds [18], and the 
same technique can be used to prove lower bounds on T(n,c ) for small, fixed values of 
n and c; in our case we used it to bound T(16,3). However, this does not yield bounds 
on, e.g., T(n, 3) for large values of n. 

The key novelty of our work is that we can use the chromatic number of successor 
graphs to give improved bounds on T(n, 3) for all values of n. To do that, it is sufficient 
to find a successor graph $\mathcal{S}_k$ with a small chromatic number and apply Lemma 12. The
same technique can also be used to study $T(n, c)$ for any fixed $c \ge 3$.